Supervised Learning for Linking Named Entities to Knowledge Base Entries

Authors

  • Ivo Anastácio
  • Bruno Martins
  • Pável Calado
Abstract

This paper addresses the challenging information extraction problem of linking named entities in text to entries in a knowledge base. Our approach uses supervised learning to (a) rank candidate knowledge base entries for each named entity, (b) classify the top-ranked entry as the correct disambiguation or not, and (c) group together the named entities without a corresponding entry in the knowledge base. We analyze the fundamental design challenges involved in the development of a learning-based entity-linking system, and we provide extensive experimental results for a wide range of methods and feature sets. Our experiments over the datasets from the Text Analysis Conference (TAC) Entity Linking Task demonstrate the effectiveness of supervised learning methods, showing that out-of-the-box algorithms and features that are relatively simple to compute can obtain very competitive results.
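
The three steps (a) to (c) can be pictured with a minimal Python sketch. It is only an illustration of the pipeline's shape, not the authors' system: the token-overlap score, the fixed threshold, and the grouping by lowercased name are hand-written stand-ins for the learned ranker, the learned classifier, and the grouping step, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Token-overlap similarity between two strings (a hypothetical feature)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def link_mentions(mentions, kb, threshold=0.5):
    """Return (links, nil_clusters) for a list of mention strings.

    kb maps an entry id to an entry name. `threshold` is an assumed cut-off
    standing in for a learned correct/incorrect classifier.
    """
    links = {}
    nil_clusters = defaultdict(list)
    for mention in mentions:
        # (a) Rank candidate entries by score, best first.
        ranked = sorted(kb, key=lambda eid: jaccard(mention, kb[eid]), reverse=True)
        top = ranked[0] if ranked else None
        # (b) Accept the top-ranked entry, or reject it (no matching entry).
        if top is not None and jaccard(mention, kb[top]) >= threshold:
            links[mention] = top
        else:
            # (c) Group unlinked mentions; here simply by lowercased name.
            nil_clusters[mention.lower()].append(mention)
    return links, dict(nil_clusters)


# Toy data, invented for illustration only.
kb = {"E1": "Lisbon", "E2": "Text Analysis Conference"}
links, nils = link_mentions(["lisbon", "Acme Corp", "ACME corp"], kb)
```

In a supervised setting, each of the three stand-ins would be replaced by a model trained on labeled mention and entry pairs, with the similarity score becoming one feature among many.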


Similar resources

MSRA at TAC 2011: Entity Linking

The Knowledge Base Population task aims at advancing the state of the art for systems that automatically discover information about named entities and then incorporate this information in a knowledge source. The overall task of populating a knowledge base is decomposed into two related tasks: Entity Linking, where names must be aligned to entities in the KB, and Slot Filling, which involves min...


A neighborhood relevance model for entity linking

Entity Linking is the task of mapping mentions in documents to entities in a knowledge base. One of the crucial tasks is to identify the disambiguating context of the mention, and joint assignment models leverage the relationships within the knowledge base. We demonstrate how joint assignment models can be approximated with information retrieval. We build on pseudo-relevance feedback and use th...


Grounded Knowledge Bases for Scientific Domains

This thesis is focused on building knowledge bases (KBs) for scientific domains. Specifically, we create structured representations of technical-domain information using unsupervised or semi-supervised learning methods. This work is inspired by recent advances in knowledge base construction based on Web text. However, in the technical domains we consider here, in addition to text corpora we hav...


Resolving polysemy and pseudonymity in entity linking with comprehensive name and context modeling

Names are important atomic information carriers in unstructured text. Matching names that refer to the same entities is an important issue in text analysis and a key component in many real world applications. Generally referred to as entity linking, it is defined as a task that aligns a name mentioned in free text to its corresponding entry in a Knowledge Base (KB). The difficulty of the task l...


Language and Domain Independent Entity Linking with Quantified Collective Validation

Linking named mentions detected in a source document to an existing knowledge base provides disambiguated entity referents for the mentions. This allows better document analysis, knowledge extraction and knowledge base population. Most of the previous research extensively exploited the linguistic features of the source documents in a supervised or semi-supervised way. These systems therefore ca...




Publication date: 2011